24 research outputs found

    Greykite: Deploying Flexible Forecasting at Scale at LinkedIn

    Forecasts help businesses allocate resources and achieve objectives. At LinkedIn, product owners use forecasts to set business targets, track outlook, and monitor health. Engineers use forecasts to efficiently provision hardware. Developing a forecasting solution to meet these needs requires accurate and interpretable forecasts on diverse time series with sub-hourly to quarterly frequencies. We present Greykite, an open-source Python library for forecasting that has been deployed on over twenty use cases at LinkedIn. Its flagship algorithm, Silverkite, provides interpretable, fast, and highly flexible univariate forecasts that capture effects such as time-varying growth and seasonality, autocorrelation, holidays, and regressors. The library enables self-serve accuracy and trust by facilitating data exploration, model configuration, execution, and interpretation. Our benchmark results show excellent out-of-the-box speed and accuracy on datasets from a variety of domains. Over the past two years, Greykite forecasts have been trusted by Finance, Engineering, and Product teams for resource planning and allocation, target setting and progress tracking, anomaly detection and root cause analysis. We expect Greykite to be useful to forecast practitioners with similar applications who need accurate, interpretable forecasts that capture complex dynamics common to time series related to human activity.
    Comment: In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '22), August 14-18, 2022, Washington, DC, USA. ACM, New York, NY, USA, 11 pages.

    Vitamin D and cause-specific vascular disease and mortality: a Mendelian randomisation study involving 99,012 Chinese and 106,911 European adults


    Statistical Aspects of High-Dimensional Sparse Artificial Neural Network Models

    An artificial neural network (ANN) is an automatic way of capturing linear and nonlinear correlations, as well as spatial and other structural dependence among features. ANNs perform well in many application areas, such as classification and prediction from magnetic resonance imaging, spatial data, and computer vision tasks. Most commonly used ANNs assume the availability of training data that is large relative to the dimension of the feature vector. However, in modern applications such as those mentioned above, training sample sizes are often small and may even be smaller than the dimension of the feature vector. In this paper, we consider a single-layer ANN classification model that is suitable for analyzing high-dimensional low sample-size (HDLSS) data. We investigate the theoretical properties of the sparse group lasso regularized neural network and show that, under mild conditions, the classification risk converges to the optimal Bayes classifier’s risk (universal consistency). Moreover, we propose a variation on the regularization term. A few examples in popular research fields are also provided to illustrate the theory and methods.
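For readers unfamiliar with the regularizer named in this abstract, the following is a minimal sketch of a sparse group lasso penalty in its commonly used form: an L1 term plus a group-wise L2 term. The abstract does not give the exact penalty or grouping the authors use, so everything here is an assumption for illustration: the function name `sparse_group_lasso_penalty`, the parameters `lam` and `alpha`, and the choice to treat each input feature's outgoing first-layer weights as one group.

```python
import numpy as np

def sparse_group_lasso_penalty(W, lam, alpha):
    """Sparse group lasso penalty on a first-layer weight matrix.

    W     : array of shape (n_features, n_hidden); row j holds the weights
            leaving input feature j (assumed grouping, one group per row).
    lam   : overall regularization strength (hypothetical parameter name).
    alpha : mixing weight in [0, 1]; 1 gives pure lasso, 0 pure group lasso.
    """
    W = np.asarray(W, dtype=float)
    # Element-wise sparsity term: sum of absolute values of all weights.
    l1_term = np.abs(W).sum()
    # Group term: Euclidean norm of each row, scaled by sqrt(group size).
    group_size = W.shape[1]
    group_term = np.sqrt(group_size) * np.linalg.norm(W, axis=1).sum()
    return lam * (alpha * l1_term + (1.0 - alpha) * group_term)
```

In a training loop this value would be added to the classification loss; the group term can drive every weight of an irrelevant feature to zero together, which is what makes the penalty attractive when features outnumber samples.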

    The Impact Of Global Unknown Teleconnection Patterns On Terrestrial Precipitation Across North And Central America

    Global sea surface temperature (SST) anomalies can affect terrestrial precipitation via ocean-atmosphere interactions known as climate teleconnections. Nonstationary and nonlinear characteristics of the teleconnection signals passing through the complex ocean-atmosphere-land system may provide a unique opportunity to quantify large-scale climate variability. This work explores the systematic relationships between global SST anomalies and terrestrial precipitation variability with respect to long-term nonlinear and nonstationary teleconnection signals during 1981–2010 over three regions in North America and one in Central America. The aim of this study was to investigate the surveillance capacity of teleconnections through varying atmospheric pathways toward different types of landscape and geographical environments. After finding possible associations between the dominant variation of seasonal precipitation and global SST anomalies through integrated empirical mode decomposition, wavelet analysis, and lagged correlation analysis, the statistically significant SST regions were extracted to identify both known and unknown teleconnections. Results indicate that previously unidentified SST regions contribute a salient portion of terrestrial precipitation variability over different terrestrial regions. The Central America and Pacific Northwest study sites receive the highest probable impacts of climate variability driven by some unknown teleconnections that reveal unique coupling interactions between oceanic and atmospheric processes, implying possible linkages with atmospheric rivers.